"Macroeconomic Conditions When Young Shape Job Preferences for Life"

Authors: Maria Cotofan, Lea Cassar, Robert Dur, Stephan Meier

Software used: Stata 16.0
Operating system: Windows 10

This file provides information on all the datasets and do files needed to reproduce the results in the paper. 

I. Raw data

1. Regional income levels are available on a yearly basis from the Bureau of Economic Analysis, since 1929. 

   This data can be found in the file income29.dta
   Original data can be downloaded from: https://www.bea.gov/

2. National unemployment rates are available on a yearly basis from the Bureau of Labor Statistics, since 1929.

   This data can be found in the file unemployment29.dta
   Original data can be downloaded from: https://www.bls.gov/

3. The General Social Survey 
   Smith, Tom W., Davern, Michael, Freese, Jeremy, and Morgan, Stephen L., General Social Surveys, 1972-2018 [machine-readable data file] /Principal Investigator, Smith, Tom W.; Co-Principal Investigators, Michael Davern, Jeremy Freese and Stephen L. Morgan; Sponsored by National Science Foundation. --NORC ed.-- Chicago: NORC, 2019.
   Original data can be downloaded from: https://gss.norc.org/
  
4. Data on Consumer Price Indexes is used to adjust for inflation, with base period 1982-1984 as defined by the US Department of Labor.
   To adjust to 2017 US dollars, we use the 2017 CPI of 245.1

   This data can be found in the file cpi.dta
   Original data can be downloaded from: https://www.bls.gov/
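
   The adjustment described in this item amounts to multiplying a nominal amount by the 2017 CPI (245.1) and dividing by the CPI of the year in which the amount was recorded. The lines below are a minimal sketch of that arithmetic, not the code used in the do-files. The variable names nominal_income and real_income_2017 and the toy values are assumptions made for illustration only.

   ```stata
   * Illustrative sketch: variable names and values are hypothetical.
   clear
   input nominal_income cpi
   20000 100
   20000 245.1
   end

   * Convert to 2017 US dollars: scale by the 2017 CPI relative to the
   * CPI of the year in which the amount was recorded.
   generate real_income_2017 = nominal_income * 245.1 / cpi
   list
   ```

   An amount recorded when the index equals its base value of 100 is scaled up by a factor of 2.451, while an amount recorded in 2017 is left essentially unchanged.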

5. Data on state-level and national-level population on a yearly basis is provided by the Bureau of Economic Analysis, since 1929.
   The GSS provides correspondence between the 9 regions in the survey and each US state.

   Data on yearly state-level population size and the correspondence between US states and GSS regions is provided in the file RegionState.dta
   Data on yearly US population size is provided in the file population.dta
   Original data on population figures can be downloaded from: https://www.bea.gov/


II. Do-files 

1. macro_cleaning.do 
   This file prepares and cleans the raw data on income levels (income29.dta) and transforms it into the file RegionalIncome.dta
   This file prepares and cleans the raw data on unemployment (unemployment29.dta) and transforms it into the file NationalUnemployment.dta

2. gss_cleaning.do
   This file prepares and cleans the raw data from the file Gss.dta
   This file calculates experiences (national unemployment and regional income levels) during the Impressionable Years (ages 18-25) for all respondents. These are our main explanatory variables.
   This file merges all auxiliary raw data (see I.) and all constructed data (see II.1) to the GSS survey.
   This file creates the final dataset used in all the results of this paper: Gss_final.dta
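
   To give a sense of what an experience variable for ages 18-25 involves, the sketch below averages a yearly macro series over the calendar years in which each respondent was aged 18 to 25. It is only an illustration under stated assumptions: it uses a simple unweighted mean, made-up values, and the hypothetical variable names age_exp and unemp_exp_1825. The actual construction in gss_cleaning.do may differ, for example in how years are weighted.

   ```stata
   * Illustrative sketch with made-up values; not the code in gss_cleaning.do.
   clear
   input year unemploy_rate
   1968 3
   1969 4
   1970 5
   1971 6
   1972 5
   1973 4
   1974 6
   1975 8
   1976 7
   1977 6
   end
   tempfile macro
   save `macro'

   * Two hypothetical respondents identified by id and birth year.
   clear
   input id birth
   1 1950
   2 1952
   end

   * One row per respondent and age from 18 to 25.
   expand 8
   bysort id: generate age_exp = 17 + _n
   generate year = birth + age_exp

   * Attach the macro series for each of those calendar years.
   merge m:1 year using `macro', keep(match) nogenerate

   * Unweighted mean over ages 18-25 (an assumption of this sketch).
   collapse (mean) unemp_exp_1825 = unemploy_rate, by(id birth)
   list
   ```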

3. results.do
   This file reproduces all results in the paper.
   By running this do-file one can reproduce:
	-Figure 1 in the paper
	-Tables 1 and 2 in the paper
	-Figures A1-A5 in the Web Appendix
	-Tables A1-A12 in the Web Appendix


III. Instructions for replication

1. Download the General Social Survey data from https://gss.norc.org/

This data should be saved as Gss.dta; it includes all waves up to and including 2016

2. Place all raw data files in one folder: 
	-income29.dta
	-unemployment29.dta
	-Gss.dta
	-cpi.dta
	-RegionState.dta
	-population.dta

3. In all do-files, change the path in the cd "" command to the path of that folder

4. Run macro_cleaning.do. This will produce two new data files: RegionalIncome.dta and NationalUnemployment.dta

5. Run gss_cleaning.do. This will produce the final dataset Gss_final.dta

6. Run results.do. This will replicate all the results in the paper.    
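
The steps above can be collected in a short master do-file. The sketch below is one possible way to do this and is not part of the replication package. The folder path is a placeholder that must be replaced, and the cd "" line inside each do-file still needs to be edited as described in step 3.

```stata
* Sketch of a master do-file; the path below is a placeholder.
cd "C:/path/to/replication/folder"

* Stop early if any raw data file from step 2 is missing.
foreach f in income29 unemployment29 Gss cpi RegionState population {
    confirm file "`f'.dta"
}

* Run the do-files in the order given in steps 4 to 6.
do macro_cleaning.do
do gss_cleaning.do
do results.do
```

Checking for the raw files first avoids a failure partway through the cleaning steps.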

IV. Data Dictionary

All the do-files provide detailed descriptions and labels for the variables used in this analysis. 

Contains data from Gss_final.dta
  obs:        54,890                          
 vars:           164                          19 Mar 2021 13:22
--------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------------------------------------------
reg16           float   %9.0g                 Region at age 16
age             float   %9.0g                 Age
year            float   %9.0g                 Year
w_income_1825   float   %9.0g                 Income experience 18-25 (regionally, in $US)
w_income_1825US float   %9.0g                 Income experience 18-25 (nationally, in $US)
sd_income_1825  float   %9.0g                 Standard Deviation of Income experience 18-25 (regionally, in $US)
popUS           double  %8.0g                 US population
income_exp_1825 float   %9.0g                 Income experience 18-25 (regionally, in logs)
income_exp_18~S float   %9.0g                 Income experience 18-25 (nationally, in logs)
w_unemp         float   %9.0g                 Unemployment experience 18-25 (nationally, in %)
id              int     %8.0g                 respondent id number
wrkstat         byte    %8.0g      LABA       labor force status
wrkslf          byte    %8.0g      LABD       r self-emp or works for somebody
wrkgovt         byte    %8.0g      WRKGOVT    govt or private employee
occ10           int     %8.0g      LABF       r's census occupation code (2010)
indus10         int     %8.0g      LABG       r's industry code (naics 2007)
marital         byte    %8.0g      MARITAL    marital status
spwrksta        byte    %8.0g      LABA       spouse labor force status
childs          byte    %8.0g      CHILDS     number of children
educ            byte    %8.0g      LABK       highest year of school completed
paeduc          byte    %8.0g      LABK       highest year school completed, father
maeduc          byte    %8.0g      LABK       highest year school completed, mother
speduc          byte    %8.0g      LABK       highest year school completed, spouse
degree          byte    %8.0g      LABL       r's highest degree
padeg           byte    %8.0g      LABL       father's highest degree
madeg           byte    %8.0g      LABL       mothers highest degree
spdeg           byte    %8.0g      LABL       spouse's highest degree
sex             byte    %8.0g      SEX        respondents sex
race            byte    %8.0g      RACE       race of respondent
incom16         byte    %8.0g      INCOM16    r's family income when 16 yrs old
born            byte    %8.0g      LABB       was r born in this country
parborn         byte    %8.0g      PARBORN    were r's parents born in this country
hompop          byte    %8.0g      HOMPOP     number of persons in household
income          byte    %8.0g      LABU       total family income
rincome         byte    %8.0g      LABU       respondents income
region          float   %8.0g      REGION     region of interview
jobinc          byte    %8.0g      ranking    high income
jobsec          byte    %8.0g      ranking    no danger of being fired
jobhour         byte    %8.0g      ranking    short working hours
jobpromo        byte    %8.0g      ranking    chances for advancement
jobmeans        byte    %8.0g      ranking    work important and feel accomplishment
realinc         double  %12.0g     LABHW      family income in constant $
realrinc        double  %12.0g     LABHW      r's income in constant $
wtss            double  %12.0g     LABTT      weight variable
wtssnr          double  %12.0g     LABTT      weight variable
wtssall         double  %12.0g     LABTT      weight variable
vstrat          int     %8.0g      LABTU      variance stratum
vpsu            byte    %8.0g      LABTU      variance primary sampling unit
Agecohort       float   %9.0g                 Age cohort
birth           float   %9.0g                 Birth Year
employed        float   %9.0g                 In employment
workforce       float   %9.0g                 In the workforce
income_imputed  float   %9.0g                 imputed realinc
income_imputedR float   %9.0g                 imputed realrinc
lnincome        float   %9.0g                 imputed household income (logs)
lnincomeR       float   %9.0g                 imputed respondent income (logs)
income_na       float   %9.0g                 imputed household income (dummy)
income_naR      float   %9.0g                 imputed respondent income (dummy)
paeduc_imp      float   %9.0g                 imputed father education 
paeduc_m        float   %9.0g                 imputed father education (dummy)
maeduc_imp      float   %9.0g                 imputed mother education                 
maeduc_m        float   %9.0g                 imputed mother education (dummy)
incom16_imp     float   %9.0g                 imputed household income at age 16          
incom16_m       float   %9.0g                 imputed household income at age 16 (dummy)
wrkslf_imputed  float   %9.0g                 imputed wrkslf
wrkslf_na       float   %9.0g                 imputed wrkslf (dummy)
sqrt_hshsize    float   %9.0g                 Household size (squared)
unemploy_rate   float   %9.0g                 Unemployment rate (%)
income_capita~n float   %9.0g                 (mean) Regional income per capital
income_capita~s float   %9.0g                 (mean) US income per capita
cpi             float   %8.0g                 (mean) Consumer Price Index
population_sum  float   %9.0g                 (mean) US population
inccap_R        float   %9.0g                 Regional income level (logs)
inccap_US       float   %9.0g                 National income level (logs)
inccurr         float   %9.0g                 Regional income level
inccurr_us      float   %9.0g                 National income level

--------------------------------------------------------------------------------------------------------------------------
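
A listing of this kind can be produced with Stata's describe command. As a quick check after replication, the sketch below loads the final dataset and compares its dimensions with the header of the listing above (54,890 observations and 164 variables). It assumes the working directory is the replication folder.

```stata
* Load the final dataset and print the variable listing.
use Gss_final.dta, clear
describe

* Compare dimensions with the header above; assert stops with an error on a mismatch.
assert _N == 54890
assert c(k) == 164
```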



